Parallelizing the Data Cube

Author

  • Mohammed J. Zaki
Abstract

This paper presents a general methodology for the efficient parallelization of existing data cube construction algorithms. We describe two different partitioning strategies, one for top-down and one for bottom-up cube algorithms. Both partitioning strategies assign subcubes to individual processors in such a way that the loads assigned to the processors are balanced. Our methods reduce interprocessor communication overhead by partitioning the load in advance instead of computing each individual group-by in parallel. Our partitioning strategies create a small number of coarse tasks, which allows prefixes and sort orders to be shared between different group-by computations. Our methods enable code reuse by permitting the use of existing sequential (external-memory) data cube algorithms for the subcube computations on each processor. This supports the transfer of optimized sequential data cube code to a parallel setting. The bottom-up partitioning strategy balances the number of single-attribute external-memory sorts made by each processor. The top-down strategy partitions a weighted tree in which weights reflect algorithm-specific cost measures such as estimated group-by sizes. Both partitioning approaches can be implemented on any shared-disk parallel machine composed of p processors connected via an interconnection fabric and with access to a shared parallel disk array. We have implemented our parallel top-down data cube construction method in C++ with the MPI message-passing library for communication and the LEDA library for the required graph algorithms. We tested our code on an eight-processor cluster, using a variety of data sets with a range of sizes, dimensions, densities, and skews. Comparison tests were performed on a SunFire 6800. The tests show that our partitioning strategies generate a close-to-optimal load balance between processors. The actual run times observed show an optimal speedup of p.
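To make the idea of assigning a small number of coarse, weighted tasks to p processors in advance more concrete, the following is a minimal Python sketch. It is not the paper's tree-partitioning algorithm, which the abstract does not spell out. It substitutes the standard greedy "heaviest task first" (LPT) heuristic, and the function names `group_bys` and `greedy_partition`, the task labels, and the weights are all hypothetical.

```python
import heapq
from itertools import combinations


def group_bys(dimensions):
    """Enumerate every group-by of a data cube: all subsets of the dimensions."""
    return [combo
            for k in range(len(dimensions), -1, -1)
            for combo in combinations(dimensions, k)]


def greedy_partition(tasks, p):
    """Assign (name, weight) tasks to p processors, heaviest task first.

    Assumption: this LPT heuristic is a stand-in for the paper's
    weighted-tree partitioning, used only to illustrate load balancing.
    """
    heap = [(0, proc) for proc in range(p)]  # (current load, processor id)
    heapq.heapify(heap)
    assignment = {proc: [] for proc in range(p)}
    for name, weight in sorted(tasks, key=lambda t: t[1], reverse=True):
        load, proc = heapq.heappop(heap)      # least-loaded processor so far
        assignment[proc].append(name)
        heapq.heappush(heap, (load + weight, proc))
    loads = {proc: load for load, proc in heap}
    return assignment, loads


# Hypothetical coarse tasks: subtrees of the group-by lattice, each with an
# illustrative weight such as a summed estimate of its group-by sizes.
subtrees = [("A", 7), ("B", 5), ("C", 4), ("D", 3), ("E", 1)]
assignment, loads = greedy_partition(subtrees, p=2)
```

A cube over d dimensions has 2^d group-bys (`len(group_bys(["A", "B", "C"]))` is 8), which is why grouping them into a few coarse tasks, rather than scheduling each group-by separately, keeps the partitioning step cheap. In this sketch each processor would then run an unmodified sequential cube algorithm on its assigned subtrees.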


Similar resources

Algorithms for Solving Rubik's Cubes

The Rubik’s Cube is perhaps the world’s most famous and iconic puzzle, well-known to have a rich underlying mathematical structure (group theory). In this paper, we show that the Rubik’s Cube also has a rich underlying algorithmic structure. Specifically, we show that the n×n×n Rubik’s Cube, as well as the n×n×1 variant, has a “God’s Number” (diameter of the configuration space) of Θ(n²/log n). ...


Extension of Cube Attack with Probabilistic Equations and its Application on Cryptanalysis of KATAN Cipher

Cube Attack is a successful case of Algebraic Attack. Cube Attack consists of two phases, linear equation extraction and solving the extracted equation system. Due to the high complexity of equation extraction phase in finding linear equations, we can extract nonlinear ones that could be approximated to linear equations with high probability. The probabilistic equations could be considered as l...


Cube Attacks on Non-Blackbox Polynomials Based on Division Property (Full Version)

The cube attack is a powerful cryptanalytic technique and is especially powerful against stream ciphers. Since we need to analyze the complicated structure of a stream cipher in the cube attack, the cube attack basically analyzes it by regarding it as a blackbox. Therefore, the cube attack is an experimental attack, and we cannot evaluate the security when the size of cube exceeds an experiment...


Interactive Rendering of Trees with Shading and Shadows

The goal of this paper is the interactive rendering of 3D trees covering a landscape, with shading and shadows consistent with the lighting conditions. We propose a new IBR representation, consisting of a hierarchy of Bidirectional Textures, which resemble 6D lightfields. A hierarchy of visibility cube-maps is associated to this representation to improve the performance of shadow calculations. ...


Numerical Study of Reynolds Number Effects on Flow over a Wall-Mounted Cube in a Channel Using LES

Turbulent flow over wall-mounted cube in a channel was investigated numerically using Large Eddy Simulation. The Selective Structure Function model was used to determine eddy viscosity that appeared in the subgrid scale stress terms in momentum equations. Studies were carried out for the flows with Reynolds number ranging from 1000 to 40000. To evaluate the computational results, data was compa...




Publication date: 2002